from autoviz.AutoViz_Class import AutoViz_Class
%matplotlib inline
AV = AutoViz_Class()
viz = AV.AutoViz("vax_demog_nationality.csv", sep=',')
Shape of your Data Set loaded: (98118, 55)
#######################################################################################
######################## C L A S S I F Y I N G V A R I A B L E S ####################
#######################################################################################
Classifying variables in data set...
Data cleaning improvement suggestions. Complete them before proceeding to ML modeling.

| | Nuniques | dtype | Nulls | Nullpercent | NuniquePercent | Value counts Min | Data cleaning improvement suggestions |
|---|---|---|---|---|---|---|---|
| full_malaysia | 3855 | int64 | 0 | 0.000000 | 3.928943 | 0 | |
| partial_malaysia | 3845 | int64 | 0 | 0.000000 | 3.918751 | 0 | |
| booster_malaysia | 3021 | int64 | 0 | 0.000000 | 3.078946 | 0 | |
| partial_indonesia | 870 | int64 | 0 | 0.000000 | 0.886687 | 0 | |
| full_indonesia | 867 | int64 | 0 | 0.000000 | 0.883630 | 0 | |
| partial_bangladesh | 665 | int64 | 0 | 0.000000 | 0.677755 | 0 | |
| full_bangladesh | 657 | int64 | 0 | 0.000000 | 0.669602 | 0 | |
| date | 621 | object | 0 | 0.000000 | 0.632911 | 158 | |
| booster_indonesia | 616 | int64 | 0 | 0.000000 | 0.627815 | 0 | |
| booster_bangladesh | 593 | int64 | 0 | 0.000000 | 0.604374 | 0 | |
| full_philippines | 524 | int64 | 0 | 0.000000 | 0.534051 | 0 | |
| partial_philippines | 505 | int64 | 0 | 0.000000 | 0.514686 | 0 | |
| booster2_malaysia | 497 | int64 | 0 | 0.000000 | 0.506533 | 0 | |
| partial_myanmar | 491 | int64 | 0 | 0.000000 | 0.500418 | 0 | |
| full_myanmar | 480 | int64 | 0 | 0.000000 | 0.489207 | 0 | |
| full_nepal | 403 | int64 | 0 | 0.000000 | 0.410730 | 0 | |
| partial_nepal | 402 | int64 | 0 | 0.000000 | 0.409711 | 0 | |
| booster_myanmar | 396 | int64 | 0 | 0.000000 | 0.403596 | 0 | |
| partial_missing | 393 | int64 | 0 | 0.000000 | 0.400538 | 0 | |
| full_missing | 377 | int64 | 0 | 0.000000 | 0.384231 | 0 | |
| partial_other | 336 | int64 | 0 | 0.000000 | 0.342445 | 0 | |
| full_other | 328 | int64 | 0 | 0.000000 | 0.334291 | 0 | |
| partial_india | 317 | int64 | 0 | 0.000000 | 0.323080 | 0 | |
| booster_nepal | 316 | int64 | 0 | 0.000000 | 0.322061 | 0 | |
| full_india | 311 | int64 | 0 | 0.000000 | 0.316965 | 0 | |
| booster_philippines | 254 | int64 | 0 | 0.000000 | 0.258872 | 0 | |
| booster_missing | 253 | int64 | 0 | 0.000000 | 0.257853 | 0 | |
| booster_other | 248 | int64 | 0 | 0.000000 | 0.252757 | 0 | |
| booster_india | 233 | int64 | 0 | 0.000000 | 0.237469 | 0 | |
| partial_pakistan | 224 | int64 | 0 | 0.000000 | 0.228297 | 0 | |
| full_pakistan | 217 | int64 | 0 | 0.000000 | 0.221162 | 0 | |
| booster_pakistan | 166 | int64 | 0 | 0.000000 | 0.169184 | 0 | |
| full_china | 164 | int64 | 0 | 0.000000 | 0.167146 | 0 | |
| full_vietnam | 162 | int64 | 0 | 0.000000 | 0.165107 | 0 | |
| partial_china | 160 | int64 | 0 | 0.000000 | 0.163069 | 0 | |
| district | 158 | object | 0 | 0.000000 | 0.161031 | 621 | |
| partial_vietnam | 157 | int64 | 0 | 0.000000 | 0.160011 | 0 | |
| partial_thailand | 146 | int64 | 0 | 0.000000 | 0.148800 | 0 | |
| full_thailand | 142 | int64 | 0 | 0.000000 | 0.144724 | 0 | |
| booster_china | 119 | int64 | 0 | 0.000000 | 0.121283 | 0 | |
| booster_vietnam | 117 | int64 | 0 | 0.000000 | 0.119244 | 0 | |
| booster_thailand | 87 | int64 | 0 | 0.000000 | 0.088669 | 0 | |
| booster2_other | 53 | int64 | 0 | 0.000000 | 0.054017 | 0 | |
| booster2_bangladesh | 45 | int64 | 0 | 0.000000 | 0.045863 | 0 | |
| booster2_myanmar | 43 | int64 | 0 | 0.000000 | 0.043825 | 0 | |
| booster2_missing | 35 | int64 | 0 | 0.000000 | 0.035671 | 0 | |
| booster2_indonesia | 32 | int64 | 0 | 0.000000 | 0.032614 | 0 | |
| booster2_philippines | 30 | int64 | 0 | 0.000000 | 0.030575 | 0 | |
| booster2_nepal | 28 | int64 | 0 | 0.000000 | 0.028537 | 0 | |
| booster2_china | 26 | int64 | 0 | 0.000000 | 0.026499 | 0 | |
| booster2_india | 19 | int64 | 0 | 0.000000 | 0.019364 | 0 | |
| state | 16 | object | 0 | 0.000000 | 0.016307 | 621 | |
| booster2_vietnam | 12 | int64 | 0 | 0.000000 | 0.012230 | 0 | |
| booster2_thailand | 12 | int64 | 0 | 0.000000 | 0.012230 | 0 | |
| booster2_pakistan | 9 | int64 | 0 | 0.000000 | 0.009173 | 0 | |
55 Predictors classified...
No variables removed since no ID or low-information variables found in data set
30 numeric variables in data exceeds limit, taking top 30 variables
List of variables selected: ['partial_malaysia', 'full_malaysia', 'booster_malaysia', 'booster2_malaysia', 'partial_indonesia', 'full_indonesia', 'booster_indonesia', 'booster2_indonesia', 'partial_bangladesh', 'full_bangladesh', 'booster_bangladesh', 'booster2_bangladesh', 'partial_myanmar', 'full_myanmar', 'booster_myanmar', 'booster2_myanmar', 'partial_philippines', 'full_philippines', 'booster_philippines', 'booster2_philippines', 'partial_nepal', 'full_nepal', 'booster_nepal', 'booster2_nepal', 'partial_india', 'full_india', 'booster_india', 'booster2_india', 'partial_pakistan', 'full_pakistan']
Total columns > 30, too numerous to print.
Number of All Scatter Plots = 465
Image size of 1500x87200 pixels is too large. It must be less than 2^16 in each direction.
Could not draw Pair Scatter Plots
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python310\site-packages\IPython\core\formatters.py:338, in BaseFormatter.__call__(self, obj)
    336 pass
    337 else:
--> 338 return printer(obj)
    339 # Finally look for special method names
    340 method = get_real_method(obj, self.print_method)

File ~\AppData\Roaming\Python\Python310\site-packages\IPython\core\pylabtools.py:152, in print_figure(fig, fmt, bbox_inches, base64, **kwargs)
    149 from matplotlib.backend_bases import FigureCanvasBase
    150 FigureCanvasBase(fig)
--> 152 fig.canvas.print_figure(bytes_io, **kw)
    153 data = bytes_io.getvalue()
    154 if fmt == 'svg':

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backend_bases.py:2308, in FigureCanvasBase.print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, bbox_inches, pad_inches, bbox_extra_artists, backend, **kwargs)
   2301 bbox_inches = rcParams['savefig.bbox']
   2303 if (self.figure.get_layout_engine() is not None or
   2304 bbox_inches == "tight"):
   2305 # we need to trigger a draw before printing to make sure
   2306 # CL works. "tight" also needs a draw to get the right
   2307 # locations:
-> 2308 renderer = _get_renderer(
   2309 self.figure,
   2310 functools.partial(
   2311 print_method, orientation=orientation)
   2312 )
   2313 with getattr(renderer, "_draw_disabled", nullcontext)():
   2314 self.figure.draw(renderer)

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backend_bases.py:1559, in _get_renderer(figure, print_method)
   1556 print_method = stack.enter_context(
   1557 figure.canvas._switch_canvas_and_return_print_method(fmt))
   1558 try:
-> 1559 print_method(io.BytesIO())
   1560 except Done as exc:
   1561 renderer, = exc.args

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backend_bases.py:2204, in FigureCanvasBase._switch_canvas_and_return_print_method.<locals>.<lambda>(*args, **kwargs)
   2200 optional_kws = { # Passed by print_figure for other renderers.
   2201 "dpi", "facecolor", "edgecolor", "orientation",
   2202 "bbox_inches_restore"}
   2203 skip = optional_kws - {*inspect.signature(meth).parameters}
-> 2204 print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
   2205 *args, **{k: v for k, v in kwargs.items() if k not in skip}))
   2206 else: # Let third-parties do as they see fit.
   2207 print_method = meth

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\_api\deprecation.py:410, in delete_parameter.<locals>.wrapper(*inner_args, **inner_kwargs)
    400 deprecation_addendum = (
    401 f"If any parameter follows {name!r}, they should be passed as "
    402 f"keyword, not positionally.")
    403 warn_deprecated(
    404 since,
    405 name=repr(name),
   (...)
    408 else deprecation_addendum,
    409 **kwargs)
--> 410 return func(*inner_args, **inner_kwargs)

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backends\backend_agg.py:517, in FigureCanvasAgg.print_png(self, filename_or_obj, metadata, pil_kwargs, *args)
    468 @_api.delete_parameter("3.5", "args")
    469 def print_png(self, filename_or_obj, *args,
    470 metadata=None, pil_kwargs=None):
    471 """
    472 Write the figure to a PNG file.
    473
   (...)
    515 *metadata*, including the default 'Software' key.
    516 """
--> 517 self._print_pil(filename_or_obj, "png", pil_kwargs, metadata)

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backends\backend_agg.py:463, in FigureCanvasAgg._print_pil(self, filename_or_obj, fmt, pil_kwargs, metadata)
    458 def _print_pil(self, filename_or_obj, fmt, pil_kwargs, metadata=None):
    459 """
    460 Draw the canvas, then save it using `.image.imsave` (to which
    461 *pil_kwargs* and *metadata* are forwarded).
    462 """
--> 463 FigureCanvasAgg.draw(self)
    464 mpl.image.imsave(
    465 filename_or_obj, self.buffer_rgba(), format=fmt, origin="upper",
    466 dpi=self.figure.dpi, metadata=metadata, pil_kwargs=pil_kwargs)

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backends\backend_agg.py:399, in FigureCanvasAgg.draw(self)
    397 def draw(self):
    398 # docstring inherited
--> 399 self.renderer = self.get_renderer()
    400 self.renderer.clear()
    401 # Acquire a lock on the shared font cache.

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\_api\deprecation.py:384, in delete_parameter.<locals>.wrapper(*inner_args, **inner_kwargs)
    379 @functools.wraps(func)
    380 def wrapper(*inner_args, **inner_kwargs):
    381 if len(inner_args) <= name_idx and name not in inner_kwargs:
    382 # Early return in the simple, non-deprecated case (much faster than
    383 # calling bind()).
--> 384 return func(*inner_args, **inner_kwargs)
    385 arguments = signature.bind(*inner_args, **inner_kwargs).arguments
    386 if is_varargs and arguments.get(name):

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backends\backend_agg.py:416, in FigureCanvasAgg.get_renderer(self, cleared)
    414 reuse_renderer = (self._lastKey == key)
    415 if not reuse_renderer:
--> 416 self.renderer = RendererAgg(w, h, self.figure.dpi)
    417 self._lastKey = key
    418 elif cleared:

File ~\AppData\Roaming\Python\Python310\site-packages\matplotlib\backends\backend_agg.py:84, in RendererAgg.__init__(self, width, height, dpi)
     82 self.width = width
     83 self.height = height
---> 84 self._renderer = _RendererAgg(int(width), int(height), dpi)
     85 self._filter_renderers = []
     87 self._update_methods()

ValueError: Image size of 1500x87200 pixels is too large. It must be less than 2^16 in each direction.
<Figure size 1500x87200 with 435 Axes>
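The figure size in the error can be reproduced with a little arithmetic. The sketch below is not AutoViz's own code: the two-plots-per-row layout and the 400 px row height are assumptions inferred from the reported `1500x87200` figure with 435 axes (435 pair plots in 218 rows of 400 px each). Under those assumptions it also estimates how many numeric variables would still fit below the 2^16 pixel limit.

```python
from math import ceil, comb

MAX_PIXELS = 2 ** 16   # limit quoted in the ValueError above
PLOTS_PER_ROW = 2      # assumption: inferred from 435 axes in an 87200 px tall figure
ROW_HEIGHT_PX = 400    # assumption: 87200 px / 218 rows


def pair_plot_height(n_vars: int) -> int:
    """Estimated figure height in pixels for all pairwise scatter plots."""
    n_plots = comb(n_vars, 2)               # one plot per unordered pair of variables
    n_rows = ceil(n_plots / PLOTS_PER_ROW)
    return n_rows * ROW_HEIGHT_PX


def max_vars_that_fit() -> int:
    """Largest variable count whose estimated height stays below MAX_PIXELS."""
    n = 2
    while pair_plot_height(n + 1) < MAX_PIXELS:
        n += 1
    return n


print(pair_plot_height(30))   # 87200, matching the failed figure
print(max_vars_that_fit())    # 26 under the layout assumptions above
```

If the layout assumptions hold, passing roughly 26 or fewer numeric columns should keep the pair scatter figure within the renderer's limit.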
[nltk_data] Downloading collection 'popular'
[nltk_data]    |
[nltk_data]    | Downloading package cmudict to C:\Users\MuhammadimYus
[nltk_data]    |     off\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\cmudict.zip.
[nltk_data]    | Downloading package gazetteers to C:\Users\Muhammadim
[nltk_data]    |     Yusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\gazetteers.zip.
[nltk_data]    | Downloading package genesis to C:\Users\MuhammadimYus
[nltk_data]    |     off\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\genesis.zip.
[nltk_data]    | Downloading package gutenberg to C:\Users\MuhammadimY
[nltk_data]    |     usoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\gutenberg.zip.
[nltk_data]    | Downloading package inaugural to C:\Users\MuhammadimY
[nltk_data]    |     usoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\inaugural.zip.
[nltk_data]    | Downloading package movie_reviews to C:\Users\Muhamma
[nltk_data]    |     dimYusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\movie_reviews.zip.
[nltk_data]    | Downloading package names to C:\Users\MuhammadimYusof
[nltk_data]    |     f\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\names.zip.
[nltk_data]    | Downloading package shakespeare to C:\Users\Muhammadi
[nltk_data]    |     mYusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\shakespeare.zip.
[nltk_data]    | Downloading package stopwords to C:\Users\MuhammadimY
[nltk_data]    |     usoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\stopwords.zip.
[nltk_data]    | Downloading package treebank to C:\Users\MuhammadimYu
[nltk_data]    |     soff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\treebank.zip.
[nltk_data]    | Downloading package twitter_samples to C:\Users\Muham
[nltk_data]    |     madimYusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\twitter_samples.zip.
[nltk_data]    | Downloading package omw to C:\Users\MuhammadimYusoff\
[nltk_data]    |     AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package omw-1.4 to C:\Users\MuhammadimYus
[nltk_data]    |     off\AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package wordnet to C:\Users\MuhammadimYus
[nltk_data]    |     off\AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package wordnet2021 to C:\Users\Muhammadi
[nltk_data]    |     mYusoff\AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package wordnet31 to C:\Users\MuhammadimY
[nltk_data]    |     usoff\AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package wordnet_ic to C:\Users\Muhammadim
[nltk_data]    |     Yusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\wordnet_ic.zip.
[nltk_data]    | Downloading package words to C:\Users\MuhammadimYusof
[nltk_data]    |     f\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping corpora\words.zip.
[nltk_data]    | Downloading package maxent_ne_chunker to C:\Users\Muh
[nltk_data]    |     ammadimYusoff\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping chunkers\maxent_ne_chunker.zip.
[nltk_data]    | Downloading package punkt to C:\Users\MuhammadimYusof
[nltk_data]    |     f\AppData\Roaming\nltk_data...
[nltk_data]    |   Unzipping tokenizers\punkt.zip.
[nltk_data]    | Downloading package snowball_data to C:\Users\Muhamma
[nltk_data]    |     dimYusoff\AppData\Roaming\nltk_data...
[nltk_data]    | Downloading package averaged_perceptron_tagger to C:\
[nltk_data]    |     Users\MuhammadimYusoff\AppData\Roaming\nltk_data.
[nltk_data]    |     ..
[nltk_data]    |   Unzipping taggers\averaged_perceptron_tagger.zip.
[nltk_data]    |
[nltk_data] Done downloading collection popular
Could not draw wordcloud plot for date
All Plots done
Time to run AutoViz = 1455 seconds
###################### AUTO VISUALIZATION Completed ########################
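One way to avoid the oversized pair plot and the long run time is to visualize one nationality at a time instead of all 55 columns. The sketch below is a suggestion, not part of the original run: `columns_for_nationality` is a hypothetical helper, and the short column list is only a sample of the names in the table above. The commented-out lines assume the installed AutoViz version accepts a DataFrame through its `dfte` argument and that the `date` column parses as a date.

```python
def columns_for_nationality(columns, nationality, keep=("date", "state", "district")):
    """Return the identifier columns plus the dose columns for one nationality.

    Hypothetical helper: relies on the <dose>_<nationality> naming seen in the
    summary table (for example partial_malaysia, booster2_malaysia).
    """
    suffix = "_" + nationality
    return [c for c in columns if c in keep or c.endswith(suffix)]


# A sample of the column names listed in the table above.
cols = [
    "date", "state", "district",
    "partial_malaysia", "full_malaysia", "booster_malaysia", "booster2_malaysia",
    "partial_nepal", "full_nepal",
]

selected = columns_for_nationality(cols, "malaysia")
print(selected)

# Possible usage with pandas and AutoViz (assumptions, not executed here):
# import pandas as pd
# df = pd.read_csv("vax_demog_nationality.csv", usecols=selected, parse_dates=["date"])
# viz = AV.AutoViz("", dfte=df)
```

With four dose columns there are only six variable pairs, so the pair scatter figure stays small.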